Time Series Forecasting

Step 01 Introduction

As a data science team, we are working with a real estate company in Beijing, China. We have been asked to determine whether the company's sales are increasing or declining.

This Time Series Forecasting project is a companion to the Beijing housing price project, focusing on forecasting instead of regression.

Data

The housing price data is carried over from Step 3 of the other project. In addition, we use several market prices and indices to improve forecasting accuracy.

Step 02 Data Wrangling

Imports

In [1]:
import warnings
warnings.filterwarnings('ignore')
In [2]:
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.preprocessing import scale

from library.sb_utils import save_file

%matplotlib inline

# Optional Plotly Method Imports
import plotly
import cufflinks as cf
cf.go_offline()

# These are for time series statistics
from statsmodels.tsa.stattools import adfuller,kpss,coint,bds,q_stat,grangercausalitytests,levinson_durbin
from statsmodels.graphics.tsaplots import plot_acf,plot_pacf # for determining (p,q) orders
from statsmodels.tsa.seasonal import seasonal_decompose      # for ETS Plots
from pmdarima import auto_arima                              # for determining ARIMA orders


from statsmodels.tools.eval_measures import rmse

Configuration

In [3]:
# Set the number of rows/columns to display
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 100)

pd.options.display.float_format = '{:.4f}'.format

Load The Data

In [4]:
df = pd.read_csv('../data/df_data_step3a_EDA.csv',parse_dates = ['tradeTime'])
df_sales = pd.read_csv('../data/time_sales_data_step3a_EDA.csv',index_col='tradeTime',parse_dates = ['tradeTime'],)

df_10yrBond = pd.read_csv('../raw_data/China 10-Year Bond Yield Historical Data.csv',index_col='Date',parse_dates = True)
df_hmi  = pd.read_csv('../raw_data/Real Residential Property Prices for China.csv',index_col='DATE',parse_dates = True)
df_sse  = pd.read_csv('../raw_data/SSE Composite Index (000001.SS) Historical Data - Yahoo Finance.csv',index_col='Date',parse_dates = True)

Clean the data

China 10-Year Bond Yield Historical Data

Data overview

In [5]:
df_10yrBond.head()
Out[5]:
Price Open High Low Change %
Date
Jan 05 4.6470 4.6470 4.6470 4.6470 2.51%
Feb 05 4.6720 4.6720 4.6720 4.6720 0.54%
Mar 05 4.4110 4.4110 4.4110 4.4110 -5.59%
Apr 05 4.3570 4.3570 4.3570 4.3570 -1.22%
May 05 4.1330 4.1330 4.1330 4.1330 -5.14%

There are some unnecessary columns.

Type of index

In [6]:
df_10yrBond.index
Out[6]:
Index(['Jan 05', 'Feb 05', 'Mar 05', 'Apr 05', 'May 05', 'Jun 05', 'Jul 05',
       'Aug 05', 'Sep 05', 'Oct 05',
       ...
       'Feb 21', 'Mar 21', 'Apr 21', 'May 21', 'Jun 21', 'Jul 21', 'Aug 21',
       'Sep 21', 'Oct 21', 'Nov 21'],
      dtype='object', name='Date', length=203)

The data is monthly ('MS', month start), but the index is currently stored as plain strings.

Clean data and rename the columns

We keep only the date and the price of the 10-year bond. The dates are at monthly frequency, so the index is renamed to 'month'.

In [7]:
df_10yrBond.rename(columns={'Price':'10yrBond',},inplace=True)
df_10yrBond.drop(columns=[ 'Open', 'High', 'Low', 'Change %'],inplace=True)
df_10yrBond.index.names=['month']
# df_10yrBond.index.astype('datetime64[ns]')
In [8]:
df_10yrBond.index
Out[8]:
Index(['Jan 05', 'Feb 05', 'Mar 05', 'Apr 05', 'May 05', 'Jun 05', 'Jul 05',
       'Aug 05', 'Sep 05', 'Oct 05',
       ...
       'Feb 21', 'Mar 21', 'Apr 21', 'May 21', 'Jun 21', 'Jul 21', 'Aug 21',
       'Sep 21', 'Oct 21', 'Nov 21'],
      dtype='object', name='month', length=203)

Now, we transform the index format to the same one as the tradeTime in the original housing price data.

In [9]:
# https://stackoverflow.com/questions/58490016/convert-mmm-yy-to-date-time-in-python
df_10yrBond.index = pd.to_datetime(df_10yrBond.index,format='%b %y').strftime('20%y-%m-01')
In [10]:
df_10yrBond.index = pd.to_datetime(df_10yrBond.index)

Make a new dataset, called market, to use in the next steps

In [11]:
market = df_10yrBond.copy()

# set index frequency
market.index.freq='MS'

# Visually check the data
market.plot();

China Housing Market Index

Data overview

In [12]:
# Check the data
df_hmi.head()
Out[12]:
QCNR628BIS
DATE
2005-04-01 87.9389
2005-07-01 89.5385
2005-10-01 89.7531
2006-01-01 88.5259
2006-04-01 90.6989

Rename the columns

In [13]:
# Rename the column to hmi (housing market index)
df_hmi.rename(columns={'QCNR628BIS':'hmi'},inplace=True)
df_hmi.index.names=['quarter']
In [14]:
df_hmi.head()
Out[14]:
hmi
quarter
2005-04-01 87.9389
2005-07-01 89.5385
2005-10-01 89.7531
2006-01-01 88.5259
2006-04-01 90.6989

Add to market dataset

In [15]:
# Add to market and reset the index name
market = pd.concat([market, df_hmi],axis=1)
market.index.names=['MS']

The original hmi is quarterly. To convert it to monthly data, the ffill method of the fillna function is applied.

In [16]:
market.hmi = market.hmi.fillna(method='ffill')
market.head()
Out[16]:
10yrBond hmi
MS
2005-01-01 4.6470 NaN
2005-02-01 4.6720 NaN
2005-03-01 4.4110 NaN
2005-04-01 4.3570 87.9389
2005-05-01 4.1330 87.9389
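
As a side note, roughly the same monthly series could be produced directly from the quarterly frame with pandas resample before the concat (within the hmi date range); a small sketch, with hmi_monthly as an illustrative name:

# Sketch: upsample the quarterly hmi to month-start ('MS') frequency and
# forward-fill, instead of concatenating first and filling afterwards.
hmi_monthly = df_hmi.resample('MS').ffill()
hmi_monthly.head()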
In [17]:
# Check the graph
market['10yrBond'].plot();

SSE Composite Index (000001.SS) from Yahoo Finance

The SSE (Shanghai Stock Exchange) Composite Index is one of the major stock indices in China.

Data overview

In [18]:
# Check the contents in the SSE data.
df_sse.head()
Out[18]:
Open High Low Close Adj Close Volume
Date
2005-01-01 1260.7820 1268.8560 1189.2111 1191.8230 1191.8230 197200
2005-02-01 1189.5010 1328.5330 1187.2600 1306.0031 1306.0031 223200
2005-03-01 1305.2490 1326.0760 1162.0310 1181.2360 1181.2360 310600
2005-04-01 1180.3730 1254.3190 1135.7330 1159.1460 1159.1460 319000
2005-05-01 1160.6200 1165.3900 1043.2760 1060.7380 1060.7380 181000

There are some unnecessary columns, which will be removed.

In [19]:
# Drop unnecessary columns
df_sse.drop(columns=[ 'Open', 'High', 'Low','Adj Close', 'Volume'],inplace=True)

# Rename the columns
df_sse.index.names=['month']
df_sse.rename(columns={'Close':'sse'},inplace=True)
In [20]:
df_sse.head()
Out[20]:
sse
month
2005-01-01 1191.8230
2005-02-01 1306.0031
2005-03-01 1181.2360
2005-04-01 1159.1460
2005-05-01 1060.7380

Add to market dataset

In [21]:
market = pd.concat([market, df_sse],axis=1)
market.head()
Out[21]:
10yrBond hmi sse
2005-01-01 4.6470 NaN 1191.8230
2005-02-01 4.6720 NaN 1306.0031
2005-03-01 4.4110 NaN 1181.2360
2005-04-01 4.3570 87.9389 1159.1460
2005-05-01 4.1330 87.9389 1060.7380
In [22]:
market.info();
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 203 entries, 2005-01-01 to 2021-11-01
Freq: MS
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   10yrBond  203 non-null    float64
 1   hmi       200 non-null    float64
 2   sse       191 non-null    float64
dtypes: float64(3)
memory usage: 6.3 KB
In [23]:
# Check the graph
market.sse.plot();

We will smooth this sse series in the same way as the 10-year bond.
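
The smoothing itself is applied in a later step; purely as an illustration, one common choice is a rolling mean (the 12-month window below is an assumption, not necessarily the method used later):

# Illustrative sketch only: a 12-month rolling mean as one way to smooth a
# monthly series; the window length is an assumption.
sse_smooth = market['sse'].rolling(window=12, min_periods=1).mean()
ax = market['sse'].plot(figsize=(12, 4), label='sse')
sse_smooth.plot(ax=ax, label='sse (12-month rolling mean)')
ax.legend();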

Total Sales data

This comes from the Step 3 EDA output of the Regression project.

Add to market dataset

In [24]:
market = pd.concat([market, df_sales[['totalSales']]],axis=1)
In [25]:
# Check the graph
market['totalSales'].plot();

Step 03 Explore The Data

Time series data

Here, we visually check the relationship between totalSales and the other market series over the sales period extended by two years on each side, i.e. from 2010 (2012-2) to 2019 (2017+2).

In [26]:
# https://stackoverflow.com/questions/20356982/matplotlib-colour-multiple-twinx-axes
def threeTimeSeries(market):
    
    """
    Plot totalSales together with three time series that have different units, each on its own y-axis.
    
    """


    fig, host = plt.subplots(
        figsize=(12, 6)
    )
    fig.subplots_adjust(
#         right=0.75
    )

    par1 = host.twinx()
    par2 = host.twinx()
    par3 = host.twinx()

    # move the spine of the second axes outwards
    par2.spines["right"].set_position(("axes", 1.1))
    par3.spines["right"].set_position(("axes", 1.2))
    

    p1, = host.plot(market.index, market.totalSales, 'r-', label="Total Sales")
    p2, = par1.plot(market.index, market['10yrBond'], 'g-', label="China 10-Year Bond")
    p3, = par2.plot(market.index, market.hmi, 'b-', label="China Housing Market Index")
    p4, = par3.plot(market.index, market['sse'], 'y-', label="SSE Composite Index")

    host.set_xlabel("Time")
    host.set_ylabel("Total Sales")
    par1.set_ylabel("China 10-Year Bond")
    par2.set_ylabel("China Housing Market Index")
    par3.set_ylabel("SSE Composite Index")

#     host.set_xlim(df_10yrBond.index.min(),df_10yrBond.index.max())

    lines = [p1, p2, p3, p4]
    host.legend(lines, [l.get_label() for l in lines],loc='upper left',)

    for ax in [par1, par2, par3]:
        ax.set_frame_on(True)
        ax.patch.set_visible(False)

        plt.setp(ax.spines.values(), visible=False)
        ax.spines["right"].set_visible(True)

    host.yaxis.label.set_color(p1.get_color())
    par1.yaxis.label.set_color(p2.get_color())
    par2.yaxis.label.set_color(p3.get_color())
    par3.yaxis.label.set_color(p4.get_color())

    host.spines["left"].set_edgecolor(p1.get_color())
    par1.spines["right"].set_edgecolor(p2.get_color())
    par2.spines["right"].set_edgecolor(p3.get_color())
    par3.spines["right"].set_edgecolor(p4.get_color())

    host.tick_params(axis='y', colors=p1.get_color())
    par1.tick_params(axis='y', colors=p2.get_color())
    par2.tick_params(axis='y', colors=p3.get_color())
    par3.tick_params(axis='y', colors=p4.get_color())

    host.spines['top'].set_visible(False)

    # host.legend(loc, bbox_to_anchor=(0.6,0.5))

    try:
        # Remove frame color
        frame = host.legend_.get_frame()
        frame.set_facecolor('1.0')
        frame.set_edgecolor('1.0')
    except:
        pass
In [27]:
threeTimeSeries(market.loc['2010-01-01':'2019-12-31'])

Total sales appear to be more correlated with the China 10-Year Bond than with the other series.

Seasonal Decomposition

Here, we check the trend and seasonality of the original total sales data, which has not yet been smoothed.

In [28]:
# Import necessary libraries
from statsmodels.tsa.seasonal import seasonal_decompose

def plotseasonal(col):
    
    fig, axes = plt.subplots(ncols=1, nrows=4, sharex=True, figsize=(12,5))
    res = seasonal_decompose(df_sales[col].dropna())
    res.observed.plot(ax=axes[0], legend=False)
    axes[0].set_ylabel('Observed')
    res.trend.plot(ax=axes[1], legend=False)
    axes[1].set_ylabel('Trend')
    res.seasonal.plot(ax=axes[2], legend=False)
    axes[2].set_ylabel('Seasonal')
    res.resid.plot(ax=axes[3], legend=False)
    axes[3].set_ylabel('Residual')
    plt.tight_layout()
    plt.show()
In [29]:
from IPython import display
from ipywidgets import interact, widgets

interact(plotseasonal,
         col=widgets.ToggleButtons(
             options=list(df_sales.columns),
             value='totalSales',
             description='district:',
             disabled=False,
             button_style='', # 'success', 'info', 'warning', 'danger' or ''
            )
        );

Both a trend and seasonality are found.

Tests for Stationarity

We use Augmented Dickey-Fuller Test

In [30]:
from statsmodels.tsa.stattools import adfuller

def adf_test(series,title=''):
    """
    Pass in a time series and an optional title 
    Returns an ADF report
    """
    print(f'Augmented Dickey-Fuller Test: {title}')
    result = adfuller(series.dropna(),autolag='AIC') # .dropna() handles differenced data
    
    labels = ['ADF test statistic','p-value','# lags used','# observations']
    out = pd.Series(result[0:4],index=labels)

    for key,val in result[4].items():
        out[f'critical value ({key})']=val
        
    print(out.to_string())          # .to_string() removes the line "dtype: float64"
    
    if result[1] <= 0.05:
        print("Strong evidence against the null hypothesis")
        print("Reject the null hypothesis")
        print("Data has no unit root and is stationary")
    else:
        print("Weak evidence against the null hypothesis")
        print("Fail to reject the null hypothesis")
        print("Data has a unit root and is non-stationary")
        
# refer to www.pieriandata.com
In [31]:
adf_test(market['totalSales'].dropna(),title='Augmented Dickey-Fuller Test on totalSales Data')
Augmented Dickey-Fuller Test: Augmented Dickey-Fuller Test on totalSales Data
ADF test statistic     -2.8422
p-value                 0.0525
# lags used            12.0000
# observations         60.0000
critical value (1%)    -3.5444
critical value (5%)    -2.9111
critical value (10%)   -2.5932
Weak evidence against the null hypothesis
Fail to reject the null hypothesis
Data has a unit root and is non-stationary
In [32]:
adf_test(market['totalSales'].diff().dropna(),title='Augmented Dickey-Fuller Test on totalSales Data')
Augmented Dickey-Fuller Test: Augmented Dickey-Fuller Test on totalSales Data
ADF test statistic     -1.3499
p-value                 0.6061
# lags used            12.0000
# observations         59.0000
critical value (1%)    -3.5464
critical value (5%)    -2.9119
critical value (10%)   -2.5937
Weak evidence against the null hypothesis
Fail to reject the null hypothesis
Data has a unit root and is non-stationary

One round of differencing makes the stationarity better.

In [33]:
adf_test(market['totalSales'].diff().diff().dropna(),title='Augmented Dickey-Fuller Test on totalSales Data')
Augmented Dickey-Fuller Test: Augmented Dickey-Fuller Test on totalSales Data
ADF test statistic     -2.1030
p-value                 0.2433
# lags used            12.0000
# observations         58.0000
critical value (1%)    -3.5485
critical value (5%)    -2.9128
critical value (10%)   -2.5941
Weak evidence against the null hypothesis
Fail to reject the null hypothesis
Data has a unit root and is non-stationary

Differencing twice does not necessarily make it better. Therefore, we will use first-order differencing in the subsequent modeling.
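
As a cross-check, the KPSS test (kpss was imported at the top) could be run on the same series. Note that KPSS reverses the hypotheses: the null is stationarity, so a small p-value suggests non-stationarity. A minimal sketch, with kpss_test as an illustrative helper that is not part of the original workflow:

# Minimal KPSS sketch: the null hypothesis is (level) stationarity, so a small
# p-value here points to non-stationarity (the opposite reading to the ADF test).
def kpss_test(series, title=''):
    stat, p_value, lags, crit = kpss(series.dropna(), regression='c', nlags='auto')
    print(f'KPSS Test: {title}')
    print(f'statistic = {stat:.4f}, p-value = {p_value:.4f}, lags used = {lags}')

kpss_test(market['totalSales'], title='totalSales')
kpss_test(market['totalSales'].diff(), title='totalSales, first difference')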

Change %

We use the pandas pct_change() method on each column to create a column representing the month-over-month change (up) in each market value. The values are fractional changes, e.g. 0.0054 means an increase of 0.54%.

In [34]:
pC = pd.DataFrame()
pC.index.names = ['MS']

for c in market.columns:
    pC[c+'_up%'] = market[c].pct_change()
pC.head()
Out[34]:
10yrBond_up% hmi_up% sse_up% totalSales_up%
MS
2005-01-01 NaN NaN NaN NaN
2005-02-01 0.0054 NaN 0.0958 NaN
2005-03-01 -0.0559 NaN -0.0955 NaN
2005-04-01 -0.0122 NaN -0.0187 NaN
2005-05-01 -0.0514 0.0000 -0.0849 NaN

Time Series - Feature Correlation

Here, we check the correlation between the feature values and the target value (totalSales).

First, we merge all of the data into one dataset.

In [35]:
sns.pairplot(pC.loc['2012-01-01':'2017-12-31']);

Create a heat map

In [36]:
# Make a function to create a correlations map
def correlations(df,targetLabel,ncol=None,fs=(12,6)):
    
    
    if ncol is None:
        ncol = 1
    nrow = 1
    
    f, ax = plt.subplots(nrow,ncol,figsize=fs)
    
    c = list(df.columns)
    c.remove(targetLabel)
    c.append(targetLabel)

    # Compute the correlation matrix
    corr = df[c].corr()

    # Generate a mask for the upper triangle
    mask = np.triu(np.ones_like(corr, dtype=bool))
    
    # Draw the heatmap with the mask and correct aspect ratio
    sns.heatmap(corr, mask=mask, cmap='coolwarm', vmax=1, vmin=-1, center=0,
                square=True, linewidths=.5, cbar_kws={"shrink": .5},
                annot=True,
                fmt=".1f",
                ax=ax
               )
    plt.tight_layout();
In [37]:
correlations(pC,'totalSales_up%',fs=(5,5));

With regard to totalSales_up%, only weak pairwise correlations of the percent changes are found with hmi_up% and sse_up%.

Correlation map By Year

In Regression Step 03 EDA, we have already explored the pairwise correlations of the columns in the original housing price data (df).

Here, we explore it in the same way, but with the following three additional month-over-month market changes:

  • '10yrBond_up%'
  • 'hmi_up%'
  • 'sse_up%'
In [38]:
df['MS']=pd.to_datetime(df.tradeTime,format='%Y-%m-%d').dt.strftime('%Y-%m-01').astype(str)
pC = pC.reset_index()
pC.MS = pC.MS.astype(str)
df = df.merge(pC.drop(columns=['totalSales_up%']),how='left',on='MS')
df.drop(columns=['MS'],inplace=True)
In [39]:
mask_df = (df.tradeTime>='2015-01-01') & (df.tradeTime<='2017-12-31')
In [40]:
# Make a list, c, to put price in the last
def mapCorrelationsByYear(df, ncol=None, figsize=(15, 8)):
    """
    Input dataframe as df.
    Return triangular shape corr map By Year
    
    """    
    
    f, axes = plt.subplots(1,len(df.year.unique()),figsize=(11*len(df.year.unique()), 9*len(df.year.unique())))
    
    for i, year in enumerate(df.year.unique()):
        
        if ncol is None:
            ncol = 1
        nrow = 1

        temp=df[df.year==year].drop(columns='year')

        c = list(temp.columns)
        c.remove('price')
        c.append('price')

        # Compute the correlation matrix
        corr = temp[c].corr()

        # Generate a mask for the upper triangle
        mask = np.triu(np.ones_like(corr, dtype=bool))

        # Draw the heatmap with the mask and correct aspect ratio
        sns.heatmap(corr, mask=mask, cmap='coolwarm', vmax=1, vmin=-1, center=0,
                    square=True, linewidths=.5, cbar_kws={"shrink": .1},
                    annot=True,
                    fmt=".1f",
                    ax=axes[i]
                   )
        axes[i].set(title='year= '+str(year))
    
    plt.tight_layout();
In [41]:
mapCorrelationsByYear(df, ncol=None, figsize=(15, 8))

These maps are somewhat hard to compare when judging how each feature affects price. Next, we focus only on the correlations with price.

Correlation bar By Year

In [42]:
def makeTrans(df):
    """
    Input dataframe 
    Return a new dataframe used to make 'Correlation bar By Year'.
    """
    for i,year in enumerate(df.year.unique()):
        temp=df[df.year==year].drop(columns='year').corr()
        if i==0:
            corrs = pd.DataFrame(columns=list(temp.columns))
            corrs.loc[year]=temp['price']
        else:
            corrs.loc[year]=temp['price']
    corrs.drop('price',axis=1,inplace=True)
    trans = corrs.T
    trans = trans.reset_index().rename(columns={'index':'features'}).melt('features', var_name='year')
    return trans
In [43]:
trans = makeTrans(df)
In [44]:
# https://github.com/jbmouret/matplotlib_for_papers/blob/master/src/plot_variance_matplotlib_white.py
def clean_layout(ax):
    
    """
    This function only makes the layout as clean as possible
    """
#     # Delete unnecessary lines
#     ax.spines['top'].set_visible(False)
#     ax.spines['right'].set_visible(False)
#     ax.spines['left'].set_visible(False)

    try:
        # Remove frame color
        frame = ax.legend_.get_frame()
        frame.set_facecolor('1.0')
        frame.set_edgecolor('1.0')
    except:
        pass

    # y axis grid line
    ax.grid(axis='y', color="0.9", linestyle='-', linewidth=1)
    ax.set_axisbelow(True)

    
In [45]:
def barCorrelationsByYear(trans):
    
    """
    Input dataframe as df
    Return bar corr map By Year
    
    """
    
    fig, ax = plt.subplots(1,1,figsize=(6,36))
    ax = sns.barplot(data=trans, x="value", hue="year", y="features", palette="Blues");
    
    rects = ax.patches
    
    # Make some labels
    labels = [f"label{i}" for i in range(len(rects))]
    
    # For each bar: Place a label
    for rect in rects:
        
        # Get X and Y placement of label from rect.
        x_value = rect.get_width()
        y_value = rect.get_y() + rect.get_height() / 2
        
        
        # Number of points between bar and label. Change to your liking.
        space = 5
        # Vertical alignment for positive values
        ha = 'left'

        # If value of bar is negative: Place label left of bar
        if x_value < 0:
            # Invert space to place label to the left
            space *= -1
            # Horizontally align label at right
            ha = 'right'

        # Use X value as label and format number with one decimal place
        label = "{:.2f}".format(x_value)

        # Create annotation
        plt.annotate(
            label,                      # Use `label` as label
            (x_value, y_value),         # Place label at end of the bar
            xytext=(space, 0),          # Horizontally shift label by `space`
            textcoords="offset points", # Interpret `xytext` as offset in points
            va='center',                # Vertically center label
            ha=ha)                      # Horizontally align label differently for
                                        # positive and negative values.

#     clean_layout(ax)
    ax.grid(axis='y')
    
    
In [46]:
barCorrelationsByYear(trans)

These simple correlation coefficients reveal the following relationships.

  • Subway has the strongest positive relationship with price among the features.
  • constructionTime shows the strongest negative relationship, meaning that newer houses tend to have a lower price (totalPrice/square).

** totalPrice is excluded here as it equals square times price.

Useful data for forecasting

In order to find additional data useful for improving the forecast, we perform the Granger causality test.

The candidates are sse, 10yrBond, and hmi.

The function takes a 2-D array [y (first column), x (second column)] and a maximum number of lags to test on x.

For the two series y and x, the null hypothesis is that lagged values of x do not explain variations in y.
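
Since the printed output per lag is long, a small illustrative helper (granger_pvalues is not part of the original workflow) can collect the ssr-based F-test p-value for each lag into one Series:

# Illustrative helper: run the test quietly and collect the ssr-based F-test
# p-value for every lag, so the long printouts below are easier to scan.
def granger_pvalues(two_cols, maxlag=12):
    res = grangercausalitytests(two_cols.dropna(), maxlag=maxlag, verbose=False)
    return pd.Series({lag: r[0]['ssr_ftest'][1] for lag, r in res.items()},
                     name='p-value (ssr F-test)')

# Example usage:
# granger_pvalues(market[['totalSales', '10yrBond']].loc['2012-01-01':'2017-12-31'])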

totalSales vs 10yrBond

In [47]:
# Here y is set as 'totalSales', and x is set as '10yrBond' which is tested to be lagged.
grangercausalitytests(market[['totalSales','10yrBond']].loc['2012-01-01':'2017-12-31'].dropna(),maxlag=12);
Granger Causality
number of lags (no zero) 1
ssr based F test:         F=6.9690  , p=0.0103  , df_denom=68, df_num=1
ssr based chi2 test:   chi2=7.2764  , p=0.0070  , df=1
likelihood ratio test: chi2=6.9272  , p=0.0085  , df=1
parameter F test:         F=6.9690  , p=0.0103  , df_denom=68, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.8771  , p=0.0257  , df_denom=65, df_num=2
ssr based chi2 test:   chi2=8.3506  , p=0.0154  , df=2
likelihood ratio test: chi2=7.8889  , p=0.0194  , df=2
parameter F test:         F=3.8771  , p=0.0257  , df_denom=65, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.0270  , p=0.1193  , df_denom=62, df_num=3
ssr based chi2 test:   chi2=6.7677  , p=0.0797  , df=3
likelihood ratio test: chi2=6.4560  , p=0.0914  , df=3
parameter F test:         F=2.0270  , p=0.1193  , df_denom=62, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=1.5429  , p=0.2016  , df_denom=59, df_num=4
ssr based chi2 test:   chi2=7.1128  , p=0.1300  , df=4
likelihood ratio test: chi2=6.7649  , p=0.1488  , df=4
parameter F test:         F=1.5429  , p=0.2016  , df_denom=59, df_num=4

Granger Causality
number of lags (no zero) 5
ssr based F test:         F=1.0891  , p=0.3766  , df_denom=56, df_num=5
ssr based chi2 test:   chi2=6.5149  , p=0.2593  , df=5
likelihood ratio test: chi2=6.2173  , p=0.2856  , df=5
parameter F test:         F=1.0891  , p=0.3766  , df_denom=56, df_num=5

Granger Causality
number of lags (no zero) 6
ssr based F test:         F=1.7101  , p=0.1368  , df_denom=53, df_num=6
ssr based chi2 test:   chi2=12.7773 , p=0.0467  , df=6
likelihood ratio test: chi2=11.6800 , p=0.0695  , df=6
parameter F test:         F=1.7101  , p=0.1368  , df_denom=53, df_num=6

Granger Causality
number of lags (no zero) 7
ssr based F test:         F=1.5555  , p=0.1707  , df_denom=50, df_num=7
ssr based chi2 test:   chi2=14.1546 , p=0.0485  , df=7
likelihood ratio test: chi2=12.8060 , p=0.0770  , df=7
parameter F test:         F=1.5555  , p=0.1707  , df_denom=50, df_num=7

Granger Causality
number of lags (no zero) 8
ssr based F test:         F=1.2999  , p=0.2668  , df_denom=47, df_num=8
ssr based chi2 test:   chi2=14.1610 , p=0.0777  , df=8
likelihood ratio test: chi2=12.7928 , p=0.1192  , df=8
parameter F test:         F=1.2999  , p=0.2668  , df_denom=47, df_num=8

Granger Causality
number of lags (no zero) 9
ssr based F test:         F=1.3579  , p=0.2363  , df_denom=44, df_num=9
ssr based chi2 test:   chi2=17.4986 , p=0.0415  , df=9
likelihood ratio test: chi2=15.4416 , p=0.0795  , df=9
parameter F test:         F=1.3579  , p=0.2363  , df_denom=44, df_num=9

Granger Causality
number of lags (no zero) 10
ssr based F test:         F=1.0552  , p=0.4173  , df_denom=41, df_num=10
ssr based chi2 test:   chi2=15.9562 , p=0.1009  , df=10
likelihood ratio test: chi2=14.1988 , p=0.1641  , df=10
parameter F test:         F=1.0552  , p=0.4173  , df_denom=41, df_num=10

Granger Causality
number of lags (no zero) 11
ssr based F test:         F=0.9314  , p=0.5217  , df_denom=38, df_num=11
ssr based chi2 test:   chi2=16.4468 , p=0.1253  , df=11
likelihood ratio test: chi2=14.5618 , p=0.2035  , df=11
parameter F test:         F=0.9314  , p=0.5217  , df_denom=38, df_num=11

Granger Causality
number of lags (no zero) 12
ssr based F test:         F=0.8976  , p=0.5580  , df_denom=35, df_num=12
ssr based chi2 test:   chi2=18.4649 , p=0.1023  , df=12
likelihood ratio test: chi2=16.0984 , p=0.1868  , df=12
parameter F test:         F=0.8976  , p=0.5580  , df_denom=35, df_num=12

The Granger causality test rejects the null hypothesis at lags of 1 and 2 months (p < 0.05). So we cannot rule out that 10yrBond explains variations in totalSales up to 2 months ahead.

totalSales vs hmi

In [48]:
grangercausalitytests(market[['totalSales','hmi']].loc['2012-01-01':'2017-12-31'].dropna(),maxlag=12);
Granger Causality
number of lags (no zero) 1
ssr based F test:         F=0.1266  , p=0.7230  , df_denom=68, df_num=1
ssr based chi2 test:   chi2=0.1322  , p=0.7161  , df=1
likelihood ratio test: chi2=0.1321  , p=0.7163  , df=1
parameter F test:         F=0.1266  , p=0.7230  , df_denom=68, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=0.9328  , p=0.3986  , df_denom=65, df_num=2
ssr based chi2 test:   chi2=2.0092  , p=0.3662  , df=2
likelihood ratio test: chi2=1.9809  , p=0.3714  , df=2
parameter F test:         F=0.9328  , p=0.3986  , df_denom=65, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=0.9686  , p=0.4133  , df_denom=62, df_num=3
ssr based chi2 test:   chi2=3.2340  , p=0.3569  , df=3
likelihood ratio test: chi2=3.1605  , p=0.3675  , df=3
parameter F test:         F=0.9686  , p=0.4133  , df_denom=62, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=1.1851  , p=0.3268  , df_denom=59, df_num=4
ssr based chi2 test:   chi2=5.4633  , p=0.2430  , df=4
likelihood ratio test: chi2=5.2549  , p=0.2621  , df=4
parameter F test:         F=1.1851  , p=0.3268  , df_denom=59, df_num=4

Granger Causality
number of lags (no zero) 5
ssr based F test:         F=0.9899  , p=0.4323  , df_denom=56, df_num=5
ssr based chi2 test:   chi2=5.9215  , p=0.3139  , df=5
likelihood ratio test: chi2=5.6743  , p=0.3392  , df=5
parameter F test:         F=0.9899  , p=0.4323  , df_denom=56, df_num=5

Granger Causality
number of lags (no zero) 6
ssr based F test:         F=1.8920  , p=0.0994  , df_denom=53, df_num=6
ssr based chi2 test:   chi2=14.1365 , p=0.0281  , df=6
likelihood ratio test: chi2=12.8091 , p=0.0462  , df=6
parameter F test:         F=1.8920  , p=0.0994  , df_denom=53, df_num=6

Granger Causality
number of lags (no zero) 7
ssr based F test:         F=1.6950  , p=0.1317  , df_denom=50, df_num=7
ssr based chi2 test:   chi2=15.4241 , p=0.0309  , df=7
likelihood ratio test: chi2=13.8402 , p=0.0541  , df=7
parameter F test:         F=1.6950  , p=0.1317  , df_denom=50, df_num=7

Granger Causality
number of lags (no zero) 8
ssr based F test:         F=1.4361  , p=0.2068  , df_denom=47, df_num=8
ssr based chi2 test:   chi2=15.6446 , p=0.0478  , df=8
likelihood ratio test: chi2=13.9962 , p=0.0819  , df=8
parameter F test:         F=1.4361  , p=0.2068  , df_denom=47, df_num=8

Granger Causality
number of lags (no zero) 9
ssr based F test:         F=1.3737  , p=0.2291  , df_denom=44, df_num=9
ssr based chi2 test:   chi2=17.7020 , p=0.0388  , df=9
likelihood ratio test: chi2=15.6006 , p=0.0757  , df=9
parameter F test:         F=1.3737  , p=0.2291  , df_denom=44, df_num=9

Granger Causality
number of lags (no zero) 10
ssr based F test:         F=0.8583  , p=0.5777  , df_denom=41, df_num=10
ssr based chi2 test:   chi2=12.9798 , p=0.2248  , df=10
likelihood ratio test: chi2=11.7852 , p=0.2997  , df=10
parameter F test:         F=0.8583  , p=0.5777  , df_denom=41, df_num=10

Granger Causality
number of lags (no zero) 11
ssr based F test:         F=0.7269  , p=0.7060  , df_denom=38, df_num=11
ssr based chi2 test:   chi2=12.8364 , p=0.3042  , df=11
likelihood ratio test: chi2=11.6496 , p=0.3905  , df=11
parameter F test:         F=0.7269  , p=0.7060  , df_denom=38, df_num=11

Granger Causality
number of lags (no zero) 12
ssr based F test:         F=0.6292  , p=0.8029  , df_denom=35, df_num=12
ssr based chi2 test:   chi2=12.9432 , p=0.3732  , df=12
likelihood ratio test: chi2=11.7202 , p=0.4684  , df=12
parameter F test:         F=0.6292  , p=0.8029  , df_denom=35, df_num=12

As for hmi (housing market index), there seems to be no Granger-causal relationship between 'totalSales' and 'hmi'.

totalSales vs sse

In [49]:
grangercausalitytests(market[['totalSales','sse']].loc['2012-01-01':'2017-12-31'].dropna(),maxlag=12);
Granger Causality
number of lags (no zero) 1
ssr based F test:         F=4.2289  , p=0.0436  , df_denom=68, df_num=1
ssr based chi2 test:   chi2=4.4155  , p=0.0356  , df=1
likelihood ratio test: chi2=4.2836  , p=0.0385  , df=1
parameter F test:         F=4.2289  , p=0.0436  , df_denom=68, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=2.0956  , p=0.1312  , df_denom=65, df_num=2
ssr based chi2 test:   chi2=4.5135  , p=0.1047  , df=2
likelihood ratio test: chi2=4.3740  , p=0.1123  , df=2
parameter F test:         F=2.0956  , p=0.1312  , df_denom=65, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.9428  , p=0.0399  , df_denom=62, df_num=3
ssr based chi2 test:   chi2=9.8250  , p=0.0201  , df=3
likelihood ratio test: chi2=9.1855  , p=0.0269  , df=3
parameter F test:         F=2.9428  , p=0.0399  , df_denom=62, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.6816  , p=0.0401  , df_denom=59, df_num=4
ssr based chi2 test:   chi2=12.3626 , p=0.0148  , df=4
likelihood ratio test: chi2=11.3588 , p=0.0228  , df=4
parameter F test:         F=2.6816  , p=0.0401  , df_denom=59, df_num=4

Granger Causality
number of lags (no zero) 5
ssr based F test:         F=1.6968  , p=0.1504  , df_denom=56, df_num=5
ssr based chi2 test:   chi2=10.1508 , p=0.0711  , df=5
likelihood ratio test: chi2=9.4516  , p=0.0924  , df=5
parameter F test:         F=1.6968  , p=0.1504  , df_denom=56, df_num=5

Granger Causality
number of lags (no zero) 6
ssr based F test:         F=1.6482  , p=0.1523  , df_denom=53, df_num=6
ssr based chi2 test:   chi2=12.3147 , p=0.0553  , df=6
likelihood ratio test: chi2=11.2913 , p=0.0798  , df=6
parameter F test:         F=1.6482  , p=0.1523  , df_denom=53, df_num=6

Granger Causality
number of lags (no zero) 7
ssr based F test:         F=2.8126  , p=0.0150  , df_denom=50, df_num=7
ssr based chi2 test:   chi2=25.5944 , p=0.0006  , df=7
likelihood ratio test: chi2=21.5803 , p=0.0030  , df=7
parameter F test:         F=2.8126  , p=0.0150  , df_denom=50, df_num=7

Granger Causality
number of lags (no zero) 8
ssr based F test:         F=3.1740  , p=0.0059  , df_denom=47, df_num=8
ssr based chi2 test:   chi2=34.5760 , p=0.0000  , df=8
likelihood ratio test: chi2=27.6445 , p=0.0005  , df=8
parameter F test:         F=3.1740  , p=0.0059  , df_denom=47, df_num=8

Granger Causality
number of lags (no zero) 9
ssr based F test:         F=2.9657  , p=0.0076  , df_denom=44, df_num=9
ssr based chi2 test:   chi2=38.2168 , p=0.0000  , df=9
likelihood ratio test: chi2=29.8702 , p=0.0005  , df=9
parameter F test:         F=2.9657  , p=0.0076  , df_denom=44, df_num=9

Granger Causality
number of lags (no zero) 10
ssr based F test:         F=2.8202  , p=0.0093  , df_denom=41, df_num=10
ssr based chi2 test:   chi2=42.6470 , p=0.0000  , df=10
likelihood ratio test: chi2=32.4544 , p=0.0003  , df=10
parameter F test:         F=2.8202  , p=0.0093  , df_denom=41, df_num=10

Granger Causality
number of lags (no zero) 11
ssr based F test:         F=2.5894  , p=0.0146  , df_denom=38, df_num=11
ssr based chi2 test:   chi2=45.7230 , p=0.0000  , df=11
likelihood ratio test: chi2=34.1211 , p=0.0003  , df=11
parameter F test:         F=2.5894  , p=0.0146  , df_denom=38, df_num=11

Granger Causality
number of lags (no zero) 12
ssr based F test:         F=2.3093  , p=0.0269  , df_denom=35, df_num=12
ssr based chi2 test:   chi2=47.5052 , p=0.0000  , df=12
likelihood ratio test: chi2=34.9917 , p=0.0005  , df=12
parameter F test:         F=2.3093  , p=0.0269  , df_denom=35, df_num=12

As for sse (the stock index), the null hypothesis is rejected most clearly at a lag of 8 months. Therefore, we cannot rule out that sse explains variations in totalSales.

Among the three, we choose the 10-year bond as the feature most related to sales.

Step 04 Preprocessing and training

Split the data into train/test sets

In [50]:
data = market[['totalSales','10yrBond']].loc['2015-01-01':'2017-12-31']

y = data['totalSales']
exog1 = data['10yrBond']
# len(y),len(exog1)


# This is for Facebook's Prophet.
df_prophet = data[['totalSales']]
df_prophet.reset_index(inplace=True)
df_prophet.columns = ['ds','y']

# Define the split line between train and test data
split=int(len(data) * 0.8) 

# Make train and test variables, with 'train, test'
data_train, data_test = data.iloc[:split], data.iloc[split:]
y_train, y_test = y.iloc[:split].to_frame(), y.iloc[split:].to_frame()

Forecast using only time series data

In [51]:
# For SARIMA Orders we set seasonal=True and pass in an m value

auto_arima(y,
           seasonal=True,
           trace=True,
           error_action='ignore',
           suppress_warnings=True,
          ).summary()
Performing stepwise search to minimize aic
 ARIMA(2,0,2)(0,0,0)[0] intercept   : AIC=1106.350, Time=0.02 sec
 ARIMA(0,0,0)(0,0,0)[0] intercept   : AIC=1110.370, Time=0.01 sec
 ARIMA(1,0,0)(0,0,0)[0] intercept   : AIC=1101.423, Time=0.01 sec
 ARIMA(0,0,1)(0,0,0)[0] intercept   : AIC=1101.358, Time=0.01 sec
 ARIMA(0,0,0)(0,0,0)[0]             : AIC=1165.177, Time=0.00 sec
 ARIMA(1,0,1)(0,0,0)[0] intercept   : AIC=1102.659, Time=0.01 sec
 ARIMA(0,0,2)(0,0,0)[0] intercept   : AIC=1103.701, Time=0.01 sec
 ARIMA(1,0,2)(0,0,0)[0] intercept   : AIC=1104.405, Time=0.02 sec
 ARIMA(0,0,1)(0,0,0)[0]             : AIC=1146.814, Time=0.01 sec

Best model:  ARIMA(0,0,1)(0,0,0)[0] intercept
Total fit time: 0.098 seconds
Out[51]:
SARIMAX Results
Dep. Variable: y No. Observations: 36
Model: SARIMAX(0, 0, 1) Log Likelihood -547.679
Date: Fri, 17 Dec 2021 AIC 1101.358
Time: 09:35:07 BIC 1106.108
Sample: 0 HQIC 1103.016
- 36
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
intercept 2.256e+06 3.32e+05 6.800 0.000 1.61e+06 2.91e+06
ma.L1 0.5502 0.179 3.074 0.002 0.199 0.901
sigma2 1.116e+12 0.102 1.09e+13 0.000 1.12e+12 1.12e+12
Ljung-Box (L1) (Q): 0.23 Jarque-Bera (JB): 1.69
Prob(Q): 0.63 Prob(JB): 0.43
Heteroskedasticity (H): 3.34 Skew: 0.53
Prob(H) (two-sided): 0.05 Kurtosis: 2.84


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
[2] Covariance matrix is singular or near-singular, with condition number 6.35e+28. Standard errors may be unstable.

SARIMA model

To make a time series forecast, we first define a helper function that fits a SARIMAX model with given orders and evaluates it with a walk-forward, one-step-ahead forecast over the test period.

In [52]:
# Load specific forecasting tools
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_squared_error
from itertools import product

# Make a function called evaluate_sarimax_model
def evaluate_sarimax_model(data, order,seasonal_order, exog=None):
    """
    Input dataframe as data
    and order and seasonal_order to be used in the SARIMAX
    
    Return error_mse,error_rmse, y_test, f(forecast on y_test)
    """
    
    
    # Needs to be an integer because it is later used as an index.
    split=int(len(data) * 0.8)
    
    # Make train and test variables, with 'train, test'
    y_train, y_test = data.iloc[:split], data.iloc[split:]
    past=[x for x in y_train]
    
    
    try:
        exog_train, exog_test = exog.iloc[:split], exog.iloc[split:]
        e=exog_test[0]
    except:
        exog_train, exog_test,e=None,None,None
    
    try:
        exog_forecast = exog_test.to_frame()
        past_exog = [x for x in exog_train]
    except:
        exog_forecast=None
        past_exog=None
        
    
    
    
    # make predictions
    f = list()
    
    for i in range(len(y_test)):#timestep-wise comparison between test data and one-step prediction ARIMA model. 
        model = SARIMAX(past,order=order,seasonal_order=seasonal_order,exog=past_exog,
#                     enforce_invertibility=False,
                   )
        
        model_fit = model.fit(disp=0)
        future = model_fit.forecast(exog=e)[0]
        f.append(future)
        past.append(y_test[i])
        if past_exog!=None:
            past_exog.append(exog_test[i])
            e=exog_test[i]
    
    
    # calculate out of sample error
    error_mse = mean_squared_error(y_test, f)
    error_rmse = rmse(y_test, f)
    
    # Return the error
    return [error_mse,error_rmse, y_test, f]
In [53]:
error_mse, error_rmse, test, f = evaluate_sarimax_model(y,(0,0,1),(0,0,1,12), exog=None)
print('error_mse %.5e, error_rmse %.0f'% (error_mse, error_rmse))
error_mse 2.90493e+11, error_rmse 538974
In [54]:
ModelComparison = pd.DataFrame([{'Model name':'SARIMA with auto_arima',
                                'MAE':error_mse,  # note: this value is the MSE computed above
                                'Best para':'(0,0,1),(0,0,1,12)',
                                }
                                ])
ModelComparison
Out[54]:
Model name MAE Best para
0 SARIMA with auto_arima 290493316389.1049 (0,0,1),(0,0,1,12)

Compare predictions to expected values

Visualize the results.

In [55]:
# Plot predictions against known values
title ='Monthly Beijing Housing Total sales'
ylabel='Total sales'
xlabel='Month'

# Refit the model
model = SARIMAX(y_train, order=(0,0,1),seasonal_order=(0,0,1,12))
model_fit = model.fit(disp=0)
predictions = model_fit.forecast(len(y_test))


# ax = market['totalSales'].loc['2015-01-01':'2017-12-31'].plot(legend=True,figsize=(12,6),title=title)
ax = y.plot(legend=True,figsize=(12,6),title=title)
predictions.plot(legend=True)
ax.autoscale(axis='x',tight=True)
ax.set(xlabel=xlabel, ylabel=ylabel);

This is the result based on the parameters suggested by auto_arima, and we use it as the baseline model.

Step 05 Modeling

SARIMA with Grid Search for different p, d, and q values.

We tried to use scikit-learn's GridSearchCV, but it kept raising errors, so here we run the grid search manually with nested for loops.
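
As an aside, itertools.product (already imported above) could flatten the seven nested loops in the next cell into a single loop; a short equivalent sketch using the same parameter lists defined two cells below:

# Sketch: iterate over all (p,d,q)(P,D,Q,m) combinations in one loop;
# the behaviour is the same as the nested loops in evaluate_models.
for p, d, q, P, D, Q, m in product(p_values, d_values, q_values,
                                   P_values, D_values, Q_values, m_values):
    order, seasonal_order = (p, d, q), (P, D, Q, m)
    # ... call evaluate_sarimax_model(y, order, seasonal_order, exog=exog) as in evaluate_models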

In [56]:
# Make a function to evaluate different ARIMA models with several different p, d, and q values.

def evaluate_models(y, p_values, d_values, q_values, P_values,D_values,Q_values, m_values,exog=None):
    best_score, best_odrder, best_sesonal_order = float("inf"), None, None
    grid_search = pd.DataFrame(columns=['p','d','q','P','D','Q','m','mse','rmse'])
    # Iterate through p_values
    for p in p_values:
        # Iterate through d_values
        for d in d_values:
            # Iterate through q_values
            for q in q_values:
                
                for P in P_values:
                    
                    for D in D_values:
                        
                        for Q in Q_values:
                            
                            for m in m_values:
                                
                                # p, d, q iterator variables in that order
                                order = (p,d,q)

                                seasonal_order = (P,D,Q,m)
#                                 print('==Now working on order%s,seasonal_order%s==' % (order,seasonal_order))
                                mse,rmse=None,None
                                ps=[p,d,q,P,D,Q,m,mse,rmse]
                                params = {'p':p, 'd':d, 'q':q, 'P':P, 'D':D, 'Q':Q, 'm':m, 'mse':mse,'rmse':rmse}
                                try:
                                    # Make a variable called mse for the Mean squared error
                                    mse,rmse = evaluate_sarimax_model(y, order, seasonal_order,exog=exog)[:2]
                                    grid_search = grid_search.append(pd.Series(ps, index = grid_search.columns), ignore_index=True)
                                    if mse < best_score:
                                        best_score, best_odrder, best_sesonal_order = mse, order, seasonal_order
#                                     print('SARIMAX %s,%s \t\tMSE=%.3f' % (order,seasonal_order,mse))
                                except:
                                    continue
#                                     print('MSE=%.3f RMSE=%.3f' % (mse,rmse))

    
            
    print(grid_search.dropna().sort_values('mse').head(1))                   
    print('Best SARIMAX %s %s MSE=%.3f' % (best_odrder, best_sesonal_order, best_score))
    return [best_odrder, best_sesonal_order, best_score]
In [57]:
p_values = [x for x in range(0, 3)]
d_values = [x for x in range(0, 3)]
q_values = [x for x in range(0, 3)]
P_values = [x for x in range(0, 3)]
D_values = [x for x in range(0, 3)]
Q_values = [x for x in range(0, 3)]
m_values = [3,4,6,12]
In [58]:
import json

try:
    
    with open("../data/sarima.json", 'r') as f:
        best_odrder1, best_sesonal_order1, best_score1 = json.load(f)
except:
    
    import warnings
    warnings.filterwarnings("ignore")
    
    best_odrder1, best_sesonal_order1, best_score1 = evaluate_models(y, p_values, d_values, q_values, P_values,D_values,Q_values, m_values)
    
    with open("../data/sarima.json", 'w') as f:
        # indent=2 is not needed but makes the file human-readable
        json.dump( [best_odrder1, best_sesonal_order1, best_score1], f) 
In [59]:
print('According to this grid search, \
the best model parameter is SARIMAX %s %s and MSE is %.5e'\
       % (best_odrder1, best_sesonal_order1, best_score1))
       
       
According to this grid search, the best model parameter is SARIMAX [1, 0, 0] [1, 0, 1, 4] and MSE is 5.79619e+10
In [60]:
# Add the metrics to the ModelComparison
def addModel(df1,name,best_score,Best_para):
    # Set a variable for Best para
    
    df2=pd.DataFrame([{'Model name':str(name),
                        'mse':best_score,
#                         'RMSE':np.sqrt(mean_squared_error(y_test,predictions)),
#                         'Explained_variance_score':explained_variance_score(y_test,predictions),
                        'Best para': Best_para,
                      } 
                    ])
    return pd.concat([df1,df2],axis=0)
In [61]:
ModelComparison = addModel(ModelComparison,'SARIMA with Grid Search',best_score1,str(best_odrder1)+','+str(best_sesonal_order1))
ModelComparison
Out[61]:
Model name MAE Best para mse
0 SARIMA with auto_arima 290493316389.1049 (0,0,1),(0,0,1,12) NaN
0 SARIMA with Grid Search NaN [1, 0, 0],[1, 0, 1, 4] 57961923350.2968
In [62]:
# Plot predictions against known values
title =''
ylabel='Total sales'
xlabel='Month'

# Refit the model
model = SARIMAX(y_train, order=best_odrder1,seasonal_order=best_sesonal_order1)
model_fit = model.fit(disp=0)
predictions = model_fit.forecast(len(y_test))



# ax = market['totalSales'].loc['2015-01-01':'2017-12-31'].plot(legend=True,figsize=(12,6),title=title)
ax = y.plot(legend=True,figsize=(12,6),title=title)
predictions.plot(legend=True)
ax.autoscale(axis='x',tight=True)
ax.set(xlabel=xlabel, ylabel=ylabel);

This result is much better than the previous one.

SARIMAX, SARIMA with an exogenous variable

Fit a model

Among the three market series, 10yrBond had the lowest p-values at short lags in the Granger test. Next, we use 10yrBond as the exogenous variable.

In [63]:
error_mse, error_rmse, test, f = evaluate_sarimax_model(y,exog=exog1,order=best_odrder1,seasonal_order=best_sesonal_order1)
print('error_mse %.5e, error_rmse %.0f'% (error_mse, error_rmse))
error_mse 4.61054e+11, error_rmse 679010
In [64]:
predictions= data_test[[]].copy()
predictions['predictions']=f
In [65]:
# Plot predictions against known values
title =''
ylabel='Total sales'
xlabel='Month'



ax = y.plot(legend=True,figsize=(12,6),title=title)
predictions['predictions'].plot(legend=True)
ax.autoscale(axis='x',tight=True)
ax.set(xlabel=xlabel, ylabel=ylabel);
In [66]:
import json

try:
    
    with open("../data/sarimax.json", 'r') as f:
        best_odrder2, best_sesonal_order2, best_score2 = json.load(f)
except:
    
    import warnings
    warnings.filterwarnings("ignore")
    
    best_odrder2, best_sesonal_order2, best_score2 = evaluate_models(y, p_values, d_values, q_values, P_values,D_values,Q_values, m_values,exog=exog1)
    
    with open("../data/sarimax.json", 'w') as f:
        # indent=2 is not needed but makes the file human-readable
        json.dump( [best_odrder2, best_sesonal_order2, best_score2], f) 
In [67]:
error_mse, error_rmse, test, f = evaluate_sarimax_model(y,exog=exog1,order=best_odrder2,seasonal_order=best_sesonal_order2,)
print('error_mse %.5e, error_rmse %.0f'% (error_mse, error_rmse))
/Users/akiofukashima/opt/anaconda3/envs/mypython3/lib/python3.7/site-packages/statsmodels/base/model.py:568: ConvergenceWarning:

Maximum Likelihood optimization failed to converge. Check mle_retvals

/Users/akiofukashima/opt/anaconda3/envs/mypython3/lib/python3.7/site-packages/statsmodels/base/model.py:568: ConvergenceWarning:

Maximum Likelihood optimization failed to converge. Check mle_retvals

/Users/akiofukashima/opt/anaconda3/envs/mypython3/lib/python3.7/site-packages/statsmodels/base/model.py:568: ConvergenceWarning:

Maximum Likelihood optimization failed to converge. Check mle_retvals

/Users/akiofukashima/opt/anaconda3/envs/mypython3/lib/python3.7/site-packages/statsmodels/base/model.py:568: ConvergenceWarning:

Maximum Likelihood optimization failed to converge. Check mle_retvals

error_mse 3.13901e+10, error_rmse 177172
/Users/akiofukashima/opt/anaconda3/envs/mypython3/lib/python3.7/site-packages/statsmodels/base/model.py:568: ConvergenceWarning:

Maximum Likelihood optimization failed to converge. Check mle_retvals

This is the lowest error so far.

In [68]:
ModelComparison = addModel(ModelComparison,'SARIMAX with Grid Search',best_score2,str(best_odrder2)+' '+str(best_sesonal_order2))
ModelComparison
Out[68]:
Model name MAE Best para mse
0 SARIMA with auto_arima 290493316389.1049 (0,0,1),(0,0,1,12) NaN
0 SARIMA with Grid Search NaN [1, 0, 0],[1, 0, 1, 4] 57961923350.2968
0 SARIMAX with Grid Search NaN [2, 1, 2] [2, 0, 1, 12] 31390066067.7890
In [69]:
# Refit the model
predictions= data_test[[]].copy()
predictions['predictions']=f
In [70]:
# Plot predictions against known values
title =''
ylabel='Total sales'
xlabel='Month'

ax = y.plot(legend=True,figsize=(12,6),title=title)
predictions['predictions'].plot(legend=True)
ax.autoscale(axis='x',tight=True)
ax.set(xlabel=xlabel, ylabel=ylabel);

Facebook's Prophet

In [71]:
import pandas as pd
from fbprophet import Prophet
%matplotlib inline
In [72]:
m = Prophet(
#     weekly_seasonality=True,
#     daily_seasonality=True,
    seasonality_mode='multiplicative',
)
m.fit(df_prophet)
future = m.make_future_dataframe(3, freq='MS')
forecast = m.predict(future)
fig = m.plot(forecast);
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
Initial log joint probability = -48.7188
Iteration  1. Log joint probability =     4.1058. Improved by 52.8246.
Iteration  2. Log joint probability =    33.9404. Improved by 29.8346.
Iteration  3. Log joint probability =     42.811. Improved by 8.87058.
Iteration  4. Log joint probability =    45.7157. Improved by 2.90471.
Iteration  5. Log joint probability =    45.8209. Improved by 0.105114.
Iteration  6. Log joint probability =    45.8358. Improved by 0.0149531.
Iteration  7. Log joint probability =    45.8632. Improved by 0.0273917.
Iteration  8. Log joint probability =     45.885. Improved by 0.0218447.
Iteration  9. Log joint probability =    45.8917. Improved by 0.00669696.
Iteration 10. Log joint probability =    45.9027. Improved by 0.010979.
Iteration 11. Log joint probability =    45.9072. Improved by 0.00445423.
Iteration 12. Log joint probability =    45.9083. Improved by 0.00108658.
Iteration 13. Log joint probability =    45.9125. Improved by 0.00424969.
Iteration 14. Log joint probability =    45.9181. Improved by 0.00553977.
Iteration 15. Log joint probability =    45.9198. Improved by 0.00175283.
Iteration 16. Log joint probability =    45.9199. Improved by 5.04676e-05.
Iteration 17. Log joint probability =    45.9221. Improved by 0.00220224.
Iteration 18. Log joint probability =    45.9249. Improved by 0.0028697.
Iteration 19. Log joint probability =    45.9262. Improved by 0.0012622.
Iteration 20. Log joint probability =    45.9274. Improved by 0.00118812.
Iteration 21. Log joint probability =    45.9283. Improved by 0.000930904.
Iteration 22. Log joint probability =    45.9286. Improved by 0.000277968.
Iteration 23. Log joint probability =    45.9291. Improved by 0.000479469.
Iteration 24. Log joint probability =    45.9293. Improved by 0.000247492.
Iteration 25. Log joint probability =    45.9297. Improved by 0.000351087.
Iteration 26. Log joint probability =    45.9297. Improved by 1.7108e-05.
Iteration 27. Log joint probability =    45.9298. Improved by 0.000142046.
Iteration 28. Log joint probability =    45.9299. Improved by 4.9263e-05.
Iteration 29. Log joint probability =      45.93. Improved by 0.000103664.
Iteration 30. Log joint probability =    45.9301. Improved by 9.11048e-05.
Iteration 31. Log joint probability =    45.9302. Improved by 0.000112431.
Iteration 32. Log joint probability =    45.9303. Improved by 8.39049e-05.
Iteration 33. Log joint probability =    45.9303. Improved by 1.46484e-05.
Iteration 34. Log joint probability =    45.9303. Improved by 9.21706e-06.
Iteration 35. Log joint probability =    45.9303. Improved by 3.65791e-05.
Iteration 36. Log joint probability =    45.9303. Improved by 1.6927e-05.
Iteration 37. Log joint probability =    45.9304. Improved by 1.68595e-05.
Iteration 38. Log joint probability =    45.9304. Improved by 2.80863e-05.
Iteration 39. Log joint probability =    45.9304. Improved by 5.24653e-06.
Iteration 40. Log joint probability =    45.9304. Improved by 2.4171e-06.
Iteration 41. Log joint probability =    45.9304. Improved by 1.08241e-05.
Iteration 42. Log joint probability =    45.9304. Improved by 1.19894e-06.
Iteration 43. Log joint probability =    45.9304. Improved by 3.46692e-06.
Iteration 44. Log joint probability =    45.9304. Improved by 1.81356e-06.
Iteration 45. Log joint probability =    45.9304. Improved by 1.95896e-06.
Iteration 46. Log joint probability =    45.9304. Improved by 1.77057e-06.
Iteration 47. Log joint probability =    45.9304. Improved by 1.58361e-06.
Iteration 48. Log joint probability =    45.9304. Improved by 2.83109e-06.
Iteration 49. Log joint probability =    45.9304. Improved by 1.92458e-06.
Iteration 50. Log joint probability =    45.9304. Improved by 1.24229e-06.
Iteration 51. Log joint probability =    45.9304. Improved by 1.36355e-06.
Iteration 52. Log joint probability =    45.9304. Improved by 1.64097e-06.
Iteration 53. Log joint probability =    45.9304. Improved by 3.89797e-06.
Iteration 54. Log joint probability =    45.9304. Improved by 1.26288e-06.
Iteration 55. Log joint probability =    45.9304. Improved by 8.5713e-08.
Iteration 56. Log joint probability =    45.9304. Improved by 1.08077e-06.
Iteration 57. Log joint probability =    45.9304. Improved by 3.59432e-07.
Iteration 58. Log joint probability =    45.9304. Improved by 5.76157e-07.
Iteration 59. Log joint probability =    45.9304. Improved by 1.78667e-07.
Iteration 60. Log joint probability =    45.9304. Improved by 3.26157e-07.
Iteration 61. Log joint probability =    45.9304. Improved by 4.43667e-08.
Iteration 62. Log joint probability =    45.9304. Improved by 8.05856e-08.
Iteration 63. Log joint probability =    45.9304. Improved by 1.38816e-07.
Iteration 64. Log joint probability =    45.9304. Improved by 4.2026e-08.
Iteration 65. Log joint probability =    45.9304. Improved by 2.86879e-08.
Iteration 66. Log joint probability =    45.9304. Improved by 2.14271e-08.
Iteration 67. Log joint probability =    45.9304. Improved by 4.1393e-08.
Iteration 68. Log joint probability =    45.9304. Improved by 3.12805e-08.
Iteration 69. Log joint probability =    45.9304. Improved by 1.22802e-08.
Iteration 70. Log joint probability =    45.9304. Improved by 3.06564e-08.
Iteration 71. Log joint probability =    45.9304. Improved by 1.42549e-08.
Iteration 72. Log joint probability =    45.9304. Improved by 1.63799e-08.
Iteration 73. Log joint probability =    45.9304. Improved by 1.0816e-08.
Iteration 74. Log joint probability =    45.9304. Improved by 4.14009e-09.
In [73]:
fig = m.plot_components(forecast)
In [74]:
from fbprophet.plot import add_changepoints_to_plot
fig = m.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), m, forecast)

Error Evaluation

Here we calculate the error to compare with the other models.

In [75]:
len(data_train)
Out[75]:
28
In [76]:
data_test['totalSales']
Out[76]:
2017-05-01    832191.8000
2017-06-01    905690.5000
2017-07-01   1240611.7000
2017-08-01   1307715.0000
2017-09-01   1590959.3000
2017-10-01   1269731.6000
2017-11-01   1491095.8000
2017-12-01   1676201.4000
Freq: MS, Name: totalSales, dtype: float64
In [77]:
start=len(data_train)
end=len(data_train)+len(data_test)-1

predictions = forecast.iloc[start:end+1]['yhat']
print('mse %.5e'% (mean_squared_error(predictions,data_test['totalSales'])))
mse 1.00977e+12

Although Prophet is much faster to set up and fit than SARIMAX, its error is much worse. This may be because the amount of data used to train the model is too small.
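
One caveat: m was fitted on the full df_prophet, so the yhat values over the test window are partly in-sample. For a comparison on the same footing as the SARIMAX walk-forward evaluation, one could refit on the training window only; a minimal sketch (m_train, future_test and yhat_test are illustrative names):

# Sketch: fit Prophet on the first 28 months only and forecast the 8 test months,
# so that the MSE is computed on out-of-sample predictions.
m_train = Prophet(seasonality_mode='multiplicative')
m_train.fit(df_prophet.iloc[:split])
future_test = m_train.make_future_dataframe(len(data_test), freq='MS')
yhat_test = m_train.predict(future_test)['yhat'].iloc[-len(data_test):]
print('mse %.5e' % mean_squared_error(yhat_test.values, data_test['totalSales'].values))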

In [78]:
ModelComparison = addModel(ModelComparison,"Facebook's Prophet",mean_squared_error(predictions,data_test['totalSales']),None)
ModelComparison
Out[78]:
Model name MAE Best para mse
0 SARIMA with auto_arima 290493316389.1049 (0,0,1),(0,0,1,12) NaN
0 SARIMA with Grid Search NaN [1, 0, 0],[1, 0, 1, 4] 57961923350.2968
0 SARIMAX with Grid Search NaN [2, 1, 2] [2, 0, 1, 12] 31390066067.7890
0 Facebook's Prophet NaN None 1009771003677.5349

RNN (Recurrent Neural Network)

Scale Data

In [79]:
from sklearn.preprocessing import MinMaxScaler
In [80]:
scaler = MinMaxScaler()
In [81]:
scaler.fit(y_train)
Out[81]:
MinMaxScaler()
In [82]:
scaled_train = scaler.transform(y_train)
scaled_test = scaler.transform(y_test)
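
A quick sanity check: scaler.inverse_transform maps scaled values back to the original sales units, which is how any scaled model output will need to be converted before computing errors.

# Round-trip check: scaling followed by inverse_transform recovers the
# original training values (up to floating point error).
restored = scaler.inverse_transform(scaled_train)
print(np.allclose(restored, y_train.values))   # expected: True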

Time Series Generator

In [83]:
from keras.preprocessing.sequence import TimeseriesGenerator
Using TensorFlow backend.
In [84]:
n_input = 12
n_features=1
generator = TimeseriesGenerator(
    scaled_train, 
    scaled_train, 
    length=n_input, 
    batch_size=1
)
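
To see what the generator yields, the first sample can be inspected; with n_input=12 and 28 scaled training points there are 16 samples, which matches the 16 steps per epoch in the training log below.

# Each sample pairs the previous 12 scaled months (X) with the following month (y).
X0, y0 = generator[0]
print(len(generator), X0.shape, y0.shape)   # 16 (1, 12, 1) (1, 1)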

Fit a Model

In [85]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
In [86]:
# define model
model = Sequential()
model.add(LSTM(150, activation='relu', input_shape=(n_input, n_features)))
model.add(Dense(1)) # Only one predicted result is required. 
model.compile(optimizer='adam', loss='mse')
2021-12-17 09:35:16.892811: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations:  SSE4.1 SSE4.2
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-17 09:35:16.893324: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 8. Tune using inter_op_parallelism_threads for best performance.
In [87]:
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 150)               91200     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 151       
=================================================================
Total params: 91,351
Trainable params: 91,351
Non-trainable params: 0
_________________________________________________________________
In [88]:
# fit model
model.fit_generator(generator,epochs=100)
Epoch 1/100
16/16 [==============================] - 1s 83ms/step - loss: 0.2782
Epoch 2/100
16/16 [==============================] - 0s 21ms/step - loss: 0.1340
Epoch 3/100
16/16 [==============================] - 0s 22ms/step - loss: 0.1140
Epoch 4/100
16/16 [==============================] - 0s 21ms/step - loss: 0.0871
[Epochs 5-94 omitted: the loss fluctuates while drifting down from about 0.08 to about 0.06]
Epoch 95/100
16/16 [==============================] - 0s 21ms/step - loss: 0.0524
Epoch 96/100
16/16 [==============================] - 0s 22ms/step - loss: 0.0521
Epoch 97/100
16/16 [==============================] - 0s 20ms/step - loss: 0.0521
Epoch 98/100
16/16 [==============================] - 0s 22ms/step - loss: 0.0533
Epoch 99/100
16/16 [==============================] - 0s 20ms/step - loss: 0.0550
Epoch 100/100
16/16 [==============================] - 0s 21ms/step - loss: 0.0676
Out[88]:
<keras.callbacks.callbacks.History at 0x7f8c5f069c90>
In [89]:
model.history.history.keys()
Out[89]:
dict_keys(['loss'])
In [90]:
loss_per_epoch = model.history.history['loss']
plt.plot(range(len(loss_per_epoch)),loss_per_epoch)
Out[90]:
[<matplotlib.lines.Line2D at 0x7f8c5eddd550>]
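The loss curve is fairly noisy, which is expected with only 16 training windows. One optional refinement, not used in this run, is an early-stopping callback so that training halts once the loss stops improving instead of always running the full 100 epochs:

from keras.callbacks import EarlyStopping

# Sketch: stop when the training loss has not improved for 10 epochs
# (monitoring the training loss because no validation split is used here).
early_stop = EarlyStopping(monitor='loss', patience=10, restore_best_weights=True)
# model.fit_generator(generator, epochs=100, callbacks=[early_stop])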

Evaluate on Test Data

In [91]:
first_eval_batch = scaled_train[-split:]
In [92]:
first_eval_batch
Out[92]:
array([[0.09266475],
       [0.        ],
       [0.25862159],
       [0.32622773],
       [0.31198791],
       [0.29679101],
       [0.27482047],
       [0.29391198],
       [0.23094754],
       [0.26722633],
       [0.34973917],
       [0.56579551],
       [0.54973534],
       [0.36150792],
       [0.79772623],
       [0.35138097],
       [0.39011035],
       [0.42700577],
       [0.65765889],
       [1.        ],
       [0.94408871],
       [0.23328972],
       [0.29493374],
       [0.51558847],
       [0.3699155 ],
       [0.72905584],
       [0.92845269],
       [0.1221668 ]])
In [93]:
first_eval_batch = first_eval_batch.reshape((1, split, n_features))
In [94]:
# Generate predictions into the same time stamps as the test set

test_predictions = []

first_eval_batch = scaled_train[-n_input:]
current_batch = first_eval_batch.reshape((1, n_input, n_features))

for i in range(len(y_test)):
    
    # predict one time step ahead ([0] selects the prediction for the single sample in the batch)
    current_pred = model.predict(current_batch)[0]
    
    # store prediction
    test_predictions.append(current_pred) 
    
    # update batch to now include prediction and drop first value
    current_batch = np.append(current_batch[:,1:,:],[[current_pred]],axis=1)
    
In [95]:
### Inverse Transformations and Compare
test_predictions
Out[95]:
[array([0.40705922], dtype=float32),
 array([0.54838717], dtype=float32),
 array([0.8879891], dtype=float32),
 array([1.2820263], dtype=float32),
 array([0.9974545], dtype=float32),
 array([0.39300567], dtype=float32),
 array([0.34618282], dtype=float32),
 array([0.3624613], dtype=float32)]
In [96]:
true_predictions = scaler.inverse_transform(test_predictions)
In [97]:
true_predictions
Out[97]:
array([[2420143.24159727],
       [3059390.33692974],
       [4595459.83965787],
       [6377747.99977894],
       [5090588.03779441],
       [2356576.95255   ],
       [2144790.33274164],
       [2218420.2720959 ]])

Create a new dataframe that holds both the original test values and our predictions for them.

In [98]:
y_test['Predictions'] = true_predictions
test.head()
Out[98]:
2017-05-01    832191.8000
2017-06-01    905690.5000
2017-07-01   1240611.7000
2017-08-01   1307715.0000
2017-09-01   1590959.3000
Freq: MS, Name: totalSales, dtype: float64
In [99]:
y_test.plot(figsize=(12,8));
In [100]:
# Plot predictions against known values
title ='Monthly Beijing Housing Total sales'
ylabel='Total sales'
xlabel='Month'

ax = y.plot(legend=True,figsize=(12,6),title=title)
y_test['Predictions'].plot(legend=True)
ax.autoscale(axis='x',tight=True)
ax.set(xlabel=xlabel, ylabel=ylabel);

Error Evaluation

In [101]:
print('mse %.5e'% (mean_squared_error(y_test['Predictions'],y_test.iloc[:,0])))
mse 7.28378e+12
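Because the squared error is hard to read at this scale, an RMSE in the original sales units could be reported alongside it. A small sketch, using the rmse helper from statsmodels:

from statsmodels.tools.eval_measures import rmse

# Sketch: RMSE is in the same units as totalSales, so it is easier to interpret
# than the raw MSE above.
print('rmse %.5e' % rmse(y_test['Predictions'], y_test['totalSales']))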
In [102]:
y_test
Out[102]:
totalSales Predictions
2017-05-01 832191.8000 2420143.2416
2017-06-01 905690.5000 3059390.3369
2017-07-01 1240611.7000 4595459.8397
2017-08-01 1307715.0000 6377747.9998
2017-09-01 1590959.3000 5090588.0378
2017-10-01 1269731.6000 2356576.9526
2017-11-01 1491095.8000 2144790.3327
2017-12-01 1676201.4000 2218420.2721
In [103]:
ModelComparison = addModel(ModelComparison,'RNN',mean_squared_error(y_test['Predictions'],y_test['totalSales']),None)
ModelComparison
Out[103]:
   Model name                 MAE                Best para                mse
0  SARIMA with auto_arima     290493316389.1049  (0,0,1),(0,0,1,12)       NaN
0  SARIMA with Grid Search    NaN                [1, 0, 0],[1, 0, 1, 4]   57961923350.2968
0  SARIMAX with Grid Search   NaN                [2, 1, 2] [2, 0, 1, 12]  31390066067.7890
0  Facebook's Prophet         NaN                None                     1009771003677.5349
0  RNN                        NaN                None                     7283775674703.0820

Conclusion

Based on the ModelComparison table, the best model is SARIMAX with Grid Search, which has the lowest MSE. We use this model for the forecast in the next step.

Step 06 Two-Month Forecast

In [104]:
# Refit the best model (SARIMAX) on the full series, with the 10-year bond yield as the exogenous regressor
model = SARIMAX(y,exog=exog1,order=(2,1,2),seasonal_order=(2,0,1,12),enforce_invertibility=False)
results = model.fit(disp=0)


exog_forecast = market[['10yrBond']].loc['2017-12-31':]
fcast = results.predict(len(y)-1,
                        len(y)+2,
                        exog=exog_forecast[:3]).rename('SARIMAX(2, 1, 2) (2, 0, 1, 12) Forecast')
/Users/akiofukashima/opt/anaconda3/envs/mypython3/lib/python3.7/site-packages/statsmodels/base/model.py:568: ConvergenceWarning:

Maximum Likelihood optimization failed to converge. Check mle_retvals
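An optional variant, not run here: results.get_forecast returns the same point forecasts together with confidence intervals, so the uncertainty around the two-month forecast could be plotted as well. A sketch, assuming exog_forecast starts at the first out-of-sample month, as the predict call above implies:

# Sketch: two-month forecast with confidence intervals (not part of the original run).
fc = results.get_forecast(steps=2, exog=exog_forecast[:2])
print(fc.predicted_mean)   # point forecasts
print(fc.conf_int())       # lower / upper bounds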

In [105]:
title='2 months Forecast'
xlabel='Month'
ylabel='Total sales'

ax = y.plot(legend=True,figsize=(16,6),title=title)
fcast.plot(legend=True)
ax.autoscale(axis='x',tight=True)
ax.set(xlabel=xlabel, ylabel=ylabel);

Save the data

In [106]:
market.index.names=['month']
In [107]:
datapath = '../data'
save_file(df, 'df_data_step3b_EDA.csv', datapath)
save_file(market.reset_index(), 'time_sales_data_step3b_EDA.csv', datapath)
A file already exists with this name.

Do you want to overwrite? (Y/N)y
Writing file.  "../data/df_data_step3b_EDA.csv"
A file already exists with this name.

Do you want to overwrite? (Y/N)y
Writing file.  "../data/time_sales_data_step3b_EDA.csv"